39 research outputs found

    QB4OLAP : Enabling business intelligence over semantic web data

    First Prize awarded by the Academia Nacional de Ingeniería. The World-Wide Web was initially conceived as a repository of information tailored for human consumption. In the last decade, the idea of transforming the web into a machine-understandable web of data has gained momentum. To this end, the World Wide Web Consortium (W3C) maintains a set of standards, referred to as the Semantic Web (SW), which allow data and metadata to be shared openly. Among these are the Resource Description Framework (RDF), which represents data as graphs; RDF-S and OWL, which describe the data structure via ontologies or vocabularies; and SPARQL, the RDF query language. On top of the RDF data model, standards and recommendations can be built to represent data that adheres to other models. The multidimensional (MD) model views data in an n-dimensional space, usually called a data cube, composed of dimensions and facts. The former reflect the perspectives from which data are viewed, and the latter correspond to points in this space, associated with (usually) quantitative data, also known as measures. Facts can be aggregated, disaggregated, and filtered using the dimensions. This process is called Online Analytical Processing (OLAP). Although the RDF Data Cube Vocabulary (QB) is the W3C standard for representing statistical data, which resembles MD data, it does not include key features needed for OLAP analysis, such as dimension hierarchies, dimension level attributes, and aggregate functions. To enable this kind of analysis over SW data cubes, in this thesis we propose the QB4OLAP vocabulary, an extension of QB. A problem remains, however: writing efficient analytical queries over SW data cubes requires a deep knowledge of RDF and SPARQL, which typical OLAP users are unlikely to have. We address this problem in this thesis. Our approach lets analytical users write queries using what they know best: OLAP operations over data cubes, without dealing with SW technicalities. For this, we devised CQL, a simple, high-level query language over data cubes. We then use the structural metadata provided by QB4OLAP to translate CQL queries into SPARQL ones. We adapt general-purpose SPARQL query optimization techniques and propose query improvement strategies to produce efficient SPARQL queries. We evaluate our implementation by tailoring the well-known Star Schema benchmark, which allows us to compare our proposal against existing ones in a fair way. We show that our approach outperforms the others. Finally, as another result, our experiments allow us to study which combinations of improvement strategies best fit an analytical scenario.

    The World-Wide Web was conceived as a repository of information to be processed and consumed by humans. In the last decade, however, the idea of transforming the Web into a large database that machines can process has gained momentum. To this end, the World Wide Web Consortium (W3C) has established a series of standards, also known as Semantic Web (SW) standards, which allow data and metadata to be shared in open formats. Notable among them are the Resource Description Framework (RDF), a graph-based data model for representing data and the relationships among them; RDF-S and OWL, which describe the structure and meaning of data by means of ontologies or vocabularies; and the SPARQL query language. These standards can be used to build representations of other data models, for example tabular or relational data. The multidimensional (MD) data model represents data within an n-dimensional space, usually called a data cube, composed of dimensions and facts. The former reflect the perspectives from which the data are analyzed, while the latter correspond to points in this n-dimensional space, associated with usually numeric values known as measures. Facts can be aggregated and summarized, disaggregated, and filtered using the dimensions. This process is known as Online Analytical Processing (OLAP). Although the W3C has established a standard that can be used to publish multidimensional data, the RDF Data Cube Vocabulary (QB), it omits some aspects of the MD model that are essential for OLAP-style analysis, such as dimension hierarchies, attributes at dimension levels, and aggregate functions for summarizing measure values. To enable this kind of analysis over cubes on the SW, this thesis proposes QB4OLAP, a vocabulary that extends QB. However, performing OLAP-style analysis efficiently over QB4OLAP cubes requires deep knowledge of RDF and SPARQL, which are far from popular among typical OLAP users. This thesis also addresses that problem. Our approach offers OLAP users a set of classical operations and then translates those operations automatically into SPARQL queries. We begin by defining a high-level query language for cubes, the Cube Query Language (CQL), and then exploit the metadata represented in QB4OLAP to perform the translation to SPARQL. We also improve the performance of the resulting queries by adapting and applying existing SPARQL query optimization techniques. To evaluate our proposal, we adapt the Star Schema benchmark, the standard for evaluating OLAP systems, to SW standards. This lets us compare our approach with other existing proposals and assess the impact of our SPARQL query improvement strategies. From this comparison we conclude that our approach outperforms other existing proposals and that our improvement techniques increase system performance tenfold.

    QB2OLAP : enabling OLAP on statistical linked open data

    Publication and sharing of multidimensional (MD) data on the Semantic Web (SW) opens new opportunities for the use of On-Line Analytical Processing (OLAP). The RDF Data Cube (QB) vocabulary, the current standard for statistical data publishing, however, lacks key MD concepts such as dimension hierarchies and aggregate functions. QB4OLAP was proposed to remedy this. However, QB4OLAP requires extensive manual annotation, and users must still write queries in SPARQL, the standard query language for RDF, which typical OLAP users are not familiar with. In this demo, we present QB2OLAP, a tool for enabling OLAP on existing QB data. Without requiring any RDF, QB(4OLAP), or SPARQL skills, it allows semi-automatic transformation of a QB data set into a QB4OLAP one via enrichment with QB4OLAP semantics, exploration of the enriched schema, and querying with the high-level OLAP language QL, which exploits the QB4OLAP semantics and is automatically translated to SPARQL. Peer reviewed. Postprint (author's final draft).

    Dimensional enrichment of statistical linked open data

    On-Line Analytical Processing (OLAP) is a data analysis technique typically used for local and well-prepared data. However, initiatives like Open Data and Open Government bring new, publicly available data to the web that are to be analyzed in the same way. The use of semantic web technologies in this context is especially encouraged by the Linked Data initiative. A considerable number of statistical linked open data sets have already been published using the RDF Data Cube Vocabulary (QB), which is designed for this purpose. However, QB lacks some essential schema constructs (e.g., dimension levels) to support OLAP. Thus, the QB4OLAP vocabulary has been proposed to extend QB with the necessary constructs and be fully compliant with OLAP. In this paper, we focus on the enrichment of an existing QB data set with QB4OLAP semantics. We first thoroughly compare the two vocabularies and outline the benefits of QB4OLAP. Then, we propose a series of steps to automate the enrichment of QB data sets with specific QB4OLAP semantics, the most important being the definition of aggregate functions and the detection of new concepts in the dimension hierarchy construction. The proposed steps form a semi-automatic enrichment method, which is implemented in a tool that enables the enrichment in an interactive and iterative fashion. The user can enrich the QB data set with QB4OLAP concepts (e.g., full-fledged dimension hierarchies) by choosing among the candidate concepts automatically discovered by the proposed steps. Finally, we conduct experiments with 25 users and use three real-world QB data sets to evaluate our approach. The evaluation demonstrates the feasibility of our approach and shows that, in practice, our tool facilitates, speeds up, and guarantees the correct results of the enrichment process. Peer reviewed. Postprint (author's final draft).

    Overcoming data scarcity in earth science.

    The data scarcity problem is repeatedly encountered in environmental research. It may lead to an inadequate representation of the complexity of an environmental system's response to any input or change (natural or human-induced). In such a case, before embarking on new, expensive studies to gather and analyze additional data, it is reasonable first to understand what improvement in estimates of system performance would result if all the available data could be well exploited. The purpose of this Special Issue, "Overcoming Data Scarcity in Earth Science" in the Data journal, is to draw attention to the body of knowledge that leads to improving the capacity to exploit the available data to better represent, understand, predict, and manage the behavior of environmental systems at meaningful space-time scales. This Special Issue contains six publications (three research articles, one review, and two data descriptors) covering a wide range of environmental fields: geophysics, meteorology/climatology, ecology, water quality, and hydrology.

    Uruguay’s COVID-19 contact tracing app reveals the growing importance of data governance frameworks

    Uruguay’s pioneering adoption of Google and Apple’s contact tracing interface is understandable given the urgent need to halt the spread of COVID-19. But this move also puts serious issues of governance, health policy, and human rights in the hands of software developers who have neither the expertise nor the legitimacy required to properly address them. During today’s crisis, just as in the future, governments must ask the right questions about data governance if they are to come up with the right policies, write Fabrizio Scrollini (ILDA), Javier Baliosian (Universidad de la República), Lorena Etcheverry (Universidad de la República), and Guillermo Moncecchi (Universidad de la República).

    Análisis del proceso de carga del sistema de data warehousing de enseñanza de la Facultad de Ingeniería

    This document describes the ETL process of the Teaching Data Warehousing System of the Facultad de Ingeniería. The process is analyzed at three levels of abstraction: a macro level that shows the data flow and identifies the subprocesses involved; an intermediate level that details, from a conceptual point of view, the activities in certain subprocesses; and a detailed level that expresses the operations involved in those subprocesses. The main goal of this work is to describe and group the ETL activities for a later analysis of how quality properties propagate through the process.

    REVISIÓN SISTEMÁTICA DE ESTRATEGIAS DE AFRONTAMIENTO EN CUIDADORES PRINCIPALES DE PERSONAS CON DEMENCIA

    Abstract: Coping strategies are a series of thoughts and actions that allow an individual to handle difficult situations in which cognitive, emotional, and behavioral processes are involved. This systematic review analyzes coping strategies in primary caregivers of people diagnosed with dementia. Using the PRISMA-NMA methodology, 24 articles were selected from the PubMed, bvsalud, SciELO, Redalyc, and DIALNET databases, with a total of 3726 caregivers evaluated. The results show that emotion-oriented coping strategies predominate, such as seeking social support and religion/spirituality, which are associated with reduced depression and burden, as well as better mental health, physical health, quality of life, and psychological well-being. The review concludes that interventions need to be planned that focus on training adaptive, emotion-oriented strategies, with the aim of regulating the emotions associated with the stressful situation and avoiding passive or avoidant coping.